Efficient Implementation of Reduce-Scatter in MPI

Authors

  • Massimo Bernaschi
  • Giulio Iannello
Abstract

We discuss the efficient implementation of a collective operation called reduce-scatter, which is defined in the MPI standard. The reduce-scatter is equivalent to the combination of a reduction on vectors of length n with a scatter of the resulting n-vector to all processors. We describe the implementation issues and the performance characterization of two new algorithms for the reduce-scatter that have been proven to be highly efficient in theory under the assumption of a fully connected parallel system. A performance comparison with existing mainstream implementations of the operation is presented, which confirms the practical advantage of the new algorithms. Experiments show that the two algorithms have different characteristics, which make them complementary in providing a performance gain over standard algorithms. Our study has been carried out in the context of the MPI standard on two different platforms: an SP2 and a Myrinet-interconnected cluster of Pentium Pro machines. However, most of the results reported here are not specific to either MPI or the platforms used, and they hold in general for any message-passing programming system.
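
The equivalence stated in the abstract (a reduction on vectors of length n followed by a scatter of the resulting n-vector) can be illustrated with a small sequential sketch. This is only a reference model of the semantics and is not either of the two algorithms the paper proposes. The function name `reduce_scatter_reference` and the `counts` parameter (the block size each process receives, playing the role of `recvcounts` in `MPI_Reduce_scatter`) are assumptions made for illustration, and a list of lists stands in for the per-process send buffers.

```python
from functools import reduce
from operator import add


def reduce_scatter_reference(vectors, counts, op=add):
    """Sequential model of reduce-scatter semantics (hypothetical helper).

    vectors: one length-n list per simulated process.
    counts:  counts[i] is the number of result elements process i receives.
    op:      associative binary operator applied element-wise.
    """
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("all input vectors must have the same length n")
    if len(counts) != len(vectors) or sum(counts) != n:
        raise ValueError("counts needs one entry per process and must sum to n")

    # Step 1: element-wise reduction of the p input vectors into one n-vector.
    reduced = [reduce(op, column) for column in zip(*vectors)]

    # Step 2: scatter consecutive blocks of the reduced vector, block i to process i.
    blocks, offset = [], 0
    for c in counts:
        blocks.append(reduced[offset:offset + c])
        offset += c
    return blocks


if __name__ == "__main__":
    send = [
        [1, 2, 3, 4, 5, 6],              # simulated process 0
        [10, 20, 30, 40, 50, 60],        # simulated process 1
        [100, 200, 300, 400, 500, 600],  # simulated process 2
    ]
    print(reduce_scatter_reference(send, [2, 2, 2]))
    # prints [[111, 222], [333, 444], [555, 666]]
```

A real implementation avoids materializing the full reduced vector on a single process; the sketch only fixes what result each process must end up holding.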

Similar resources

An Interface for Efficient Vector Scatters and Gathers on Parallel Machines

Barry F. Smith. Scatter and gather type operations form the heart of most communication kernels in parallel partial differential equation solvers. This paper introduces a simple interface for defining and applying scatter and gather operations on both distributed and shared memory computers. A key feature of the i...

Improving the Performance of MPI Collective Communication on Switched Networks

In this paper, we present new algorithms for improving the performance of collective communication operations in MPI. Our target architecture is a cluster of machines connected by a switched network such as Myrinet or switched Ethernet. We have developed new algorithms for all the MPI collective communication operations, namely, scatter/gather/reduce, allgather/allreduce, broadcast, reduce-scat...

Implementation and Performance of the MPIMessage

MPI is the new standard which defines a set of message passing operations for multicomputers and clustered systems. In comparison to other popular message passing systems, MPI provides a richer collection of functions, allowing efficient implementations, portability and excellent support for the development of parallel libraries. In this paper, we describe the implementation and performance of MP...

Plasma Simulation on Networks of Workstations using the Bulk-Synchronous Parallel Model

Computationally intensive applications with frequent communication and synchronization require careful design for efficient execution on networks of workstations. We describe a Bulk-Synchronous Processing (BSP) model implementation of a plasma simulation and use of BSP analysis techniques for tuning the program for arbitrary architectures. In addition, we compare the performance of the BSP implem...

A Threads-Only MPI Implementation for the Development of Parallel Programs

In this paper, we present a threads-only implementation of MPI, called TOMPI, that allows efficient development of parallel programs on a workstation. The communication and context-switching overhead is reduced significantly compared to existing MPI implementations, by the use of threads and shared memory in place of UNIX processes and often sockets. Results demonstrate the scalability of TOMPI i...

Publication date: 1998